NUS at WMT09: Domain Adaptation Experiments for English-Spanish Machine Translation of News Commentary Text
Abstract
We describe the system developed by the National University of Singapore team for English-to-Spanish machine translation of News Commentary text in the WMT09 Shared Translation Task. Our approach is based on domain adaptation: we combined a small in-domain News Commentary bi-text with a large out-of-domain bi-text from the Europarl corpus, building a separate phrase table from each and then combining the two. We further combined two language models (in-domain and out-of-domain) and experimented with cognates, improved tokenization, and recasing. Among systems trained without additional external data for English-to-Spanish translation at the shared task, ours achieved the highest lowercased NIST score (6.963) and the second-best lowercased BLEU score (24.91%).
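The abstract does not say how the two phrase tables were combined. As a minimal sketch of one common option, assuming a simple "prefer in-domain" merge with an added origin feature, the idea could look like the following. The function name, the table layout, and the scores are all hypothetical illustrations and are not taken from the paper.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: (source phrase, target phrase) -> list of feature scores.
PhraseTable = Dict[Tuple[str, str], List[float]]


def merge_phrase_tables(in_domain: PhraseTable,
                        out_of_domain: PhraseTable) -> PhraseTable:
    """Merge two phrase tables, letting in-domain entries win on conflicts.

    Every merged entry keeps its original scores and gains one extra
    feature: 1.0 if the pair came from the in-domain table, else 0.0.
    This is an assumed scheme for illustration, not the paper's method.
    """
    merged: PhraseTable = {}
    # Start with the large out-of-domain table.
    for pair, scores in out_of_domain.items():
        merged[pair] = scores + [0.0]
    # Overwrite with in-domain entries so they take priority.
    for pair, scores in in_domain.items():
        merged[pair] = scores + [1.0]
    return merged


# Toy data with made-up scores.
in_domain = {("house", "casa"): [0.8, 0.7]}
out_of_domain = {
    ("house", "casa"): [0.6, 0.5],
    ("parliament", "parlamento"): [0.9, 0.9],
}
merged = merge_phrase_tables(in_domain, out_of_domain)
```

The extra origin feature lets a tuned log-linear model learn how much to trust each source, which is one way a small in-domain table can outweigh a much larger out-of-domain one.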
Similar resources
Improving English-Spanish Statistical Machine Translation: Experiments in Domain Adaptation, Sentence Paraphrasing, Tokenization, and Recasing
We describe the experiments of the UC Berkeley team on improving English-Spanish machine translation of news text, as part of the WMT’08 Shared Translation Task. We experiment with domain adaptation, combining a small in-domain news bi-text and a large out-of-domain one from the Europarl corpus, building two separate phrase translation models and two separate language models. We further add a t...
NICT@WMT09: Model Adaptation and Transliteration for Spanish-English SMT
This paper describes the NICT statistical machine translation (SMT) system used for the WMT 2009 Shared Task (WMT09) evaluation. We participated in the Spanish-English translation task. The focus of this year’s participation was to investigate model adaptation and transliteration techniques in order to improve the translation quality of the baseline phrase-based SMT system.
PJIIT's systems for WMT 2017 Conference
In this paper, we attempt to improve Statistical Machine Translation (SMT) systems between Czech, Latvian and English in the WMT’17 News translation task. We also participated in the Biomedical task and produced translation engines from English into Polish, Czech, German, Spanish, French, Hungarian, Romanian and Swedish. To accomplish this, we performed translation model training, created adaptatio...
QCRI at WMT12: Experiments in Spanish-English and German-English Machine Translation of News Text
We describe the systems developed by the team of the Qatar Computing Research Institute for the WMT12 Shared Translation Task. We used a phrase-based statistical machine translation model with several non-standard settings, most notably tuning data selection and phrase table combination. The evaluation results show that we rank second in BLEU and TER for Spanish-English, and in the top tier for...
The University of Maryland Statistical Machine Translation System for the Fourth Workshop on Machine Translation
This paper describes the techniques we explored to improve the translation of news text in the German-English and Hungarian-English tracks of the WMT09 shared translation task. Beginning with a conventional hierarchical phrase-based system, we found benefits for using word segmentation lattices as input, explicit generation of beginning and end of sentence markers, minimum Bayes risk decoding, an...